Abstract



Coefficient alpha is a simple and very useful index of test reliability that is 
widely used in educational and psychological measurement. Classical statistical 
inference for coefficient alpha is well developed. This paper presents two methods 
for Bayesian statistical inference for a single sample alpha coefficient. An 
approximate analytic method based on conjugate distributions is derived. This 
method is easy to compute. A second method uses MCMC methodology as 
implemented by the computer program WinBUGS. WinBUGS may be 
downloaded for free from the Internet, and this paper includes WinBUGS code 
for making Bayesian inferences about coefficient alpha. Pseudo-randomly 
generated data are used to compare the two Bayesian methods to each other and 
both of those methods to the classical method. The results indicate that the two 
Bayesian methods work well so long as the numbers of items and examinees are 
not too small. 
Bayesian Statistical Inference for Coefficient Alpha 



Frequentist inferential procedures for coefficient alpha are well developed. 
The paper by Feldt, Woodruff, and Salih (1987) presents a summary of the 
different methods, gives complete references to the area, and also discusses the 
robustness of the procedures to violations of their assumptions. A more recent 
paper by Hakstian and Barchard (2000) also evaluates the robustness of the 
procedures and references other recent studies on the robustness of the 
procedures. The methods depend on normal distribution theory for random and 
mixed effects ANOVA. Even though the methods are often employed using 
dichotomous item response data, the procedures generally perform well when 
there are reasonable numbers of items and examinees. Two papers by Feldt and 
Ankenmann (1998, 1999) contain classical power curves and tables for testing the 
difference between two independent sample alpha coefficients or two dependent 
sample alpha coefficients. A more recent paper on the sampling distribution of 
coefficient alpha is VanZyl, Neudecker, and Nel (2000). Their paper takes a 
multivariate approach instead of the ANOVA approach used in the earlier 
papers. 

The present paper derives two Bayesian procedures for making inference 
about a single alpha coefficient. Our procedures also depend on normal 
distribution theory for random and mixed effects ANOVA. Our first procedure is 
an approximate analytic method based on conjugate distributions. This method is 
relatively simple and easy to compute. Our second method uses MCMC 
methodology as implemented by the WinBUGS computer program. Example 
WinBUGS code is included in Appendix B. We compare the two Bayesian 
procedures to each other and both to a frequentist method. 

One advantage gained from using Bayesian techniques is the ability to 
incorporate disparate but relevant information into the analysis by way of the 
prior distribution. Another advantage is the ability to combine data from 
different analyses by using the posterior distribution obtained from a previous 
analysis as the prior distribution for the next analysis. These properties can be 
useful to test developers creating new tests. If the new tests are related to 
previous tests, then information about the reliability of the earlier tests can be 
used to develop an initial prior distribution for the new tests. In addition, test 
developers initially may have to use shorter versions of the tests and administer 
these versions at different times and to different small groups of students. In 
these situations inference can be updated by using the posterior distribution 
obtained from an earlier analysis as the prior distribution for the next analysis. 
Test users may want to calculate the reliability of a test in a specific group of 
examinees of special interest to them, and the test may be administered to only a 
few examinees at any one time, but at regular intervals over time. The Bayesian 
process of updating the inference from prior to posterior to new prior could prove 
convenient in such situations. 

Feldt’s inferential procedures for alpha are based on the demonstration by 
Hoyt (1941) that the sample alpha coefficient can be computed from the observed 
student mean square and the observed interaction mean square of a two-way 
students by items ANOVA with one observation per cell. Hoyt’s result is an 
algebraic identity. It is true for any two-way table of numbers with one 
observation per cell. 
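
Hoyt's identity is easy to check numerically. The sketch below is illustrative
only: the function names and the small score matrix are assumptions made for
this example, not material from the cited papers. It computes the sample alpha
once from item and total-score variances and once from the two-way ANOVA mean
squares.

```python
def alpha_from_variances(scores):
    """Sample alpha from item variances and the variance of total scores."""
    n_persons, k = len(scores), len(scores[0])

    def var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[j] for row in scores]) for j in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_from_anova(scores):
    """Sample alpha as (MS_persons - MS_interaction) / MS_persons."""
    n_persons, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_persons * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n_persons for j in range(k)]
    ss_persons = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_items = n_persons * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_inter = ss_total - ss_persons - ss_items
    ms_persons = ss_persons / (n_persons - 1)
    ms_inter = ss_inter / ((n_persons - 1) * (k - 1))
    return (ms_persons - ms_inter) / ms_persons


# Hypothetical 5 examinees by 4 dichotomous items, for illustration only.
example_scores = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
print(alpha_from_variances(example_scores), alpha_from_anova(example_scores))
```

The two functions agree for any rectangular table of numbers with nonzero
total-score variance, which is the sense in which the result is an identity
rather than a distributional claim.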

Using Hoyt’s 1941 result, Feldt (1965) developed a frequentist procedure for 
a single alpha coefficient based on ANOVA normal distribution theory. He 
considered the two-way examinees by items random effects ANOVA model with 
one observation per cell, but his results also can be valid under a mixed model. In 
practice examinees often are randomly sampled from a large population of 
examinees and the same is sometimes true for items, though the sampling of 
items is not always strictly random. In these situations the random effects 
ANOVA model is most appropriate. In other situations only the items actually 
administered are of interest so items are treated as a fixed effect and the mixed 
ANOVA model should be used. 

VanZyl, Neudecker, and Nel (2000) derive the same result as Feldt, but 
their derivation is based on the multivariate form of the mixed ANOVA model 
that assumes a compound-symmetric covariance matrix. The derivation of our 
analytic Bayesian method is similar to the derivation used by VanZyl et al. 
(2000), but we use the two-way random effects ANOVA model with WinBUGS. 
Methodology 

In what follows, the true and sample values of coefficient alpha can take 
values between $-\infty$ and one. Negative values for coefficient alpha can occur 
when the inter-item correlations are negative. See VanZyl, Neudecker, and Nel 
(2000) for additional discussion of this issue. We denote coefficient alpha by $\rho$ 
and we take as prior distribution for $\rho$ 

$$p(\rho) = \frac{\beta'^{\alpha'}}{\Gamma(\alpha')}\,(1-\rho)^{\alpha'-1}\exp\left[-\beta'(1-\rho)\right], \qquad \alpha' > 0,\ \beta' > 0,\ -\infty < \rho < 1. \tag{1}$$

This is a gamma type distribution and we denote it $G_\rho(\alpha', \beta')$. It is convenient 
to indicate our prior knowledge about $\rho$ by specifying a likely value for $\rho$. We 
then can indicate our confidence in this value by specifying a hypothetical prior 
sample size upon which the value is based. We denote the prior mean of $\rho$ as $r'$, 
and we assign it our prior estimate of the value of $\rho$. We next define $n' + 1$ as 
the size of a hypothetical prior sample that indicates the strength of our prior 
belief. Finally we take $\alpha' = n'/2$ and $\beta' = n'/[2(1 - r')]$. Using E, M, and V to 
denote mean, mode, and variance we find that 

$$E(\rho) = r', \tag{2}$$

$$M(\rho) = \frac{n'-2}{n'}\,r' + \frac{2}{n'}, \quad \text{and} \tag{3}$$

$$V(\rho) = \frac{2(1-r')^2}{n'}. \tag{4}$$

Having specified values for $n'$ and $r'$ we then can compute from equations 
(3) and (4) the prior mode and prior variance for $\rho$. If $n'$ is less than three the 
prior density will be J shaped and without a well-defined mode. Taking $n' = 0$ 
yields an improper non-informative prior density for $\rho$. Note that coefficient 
alpha cannot be computed from a sample of size one. 

We recommend specifying the prior mean of $\rho$ and not the prior mode of $\rho$. 
When $n'$ is small the prior mean and prior mode can have quite different values 
and then better results are obtained by specifying a value for the prior mean 
rather than the prior mode. 
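
As a rough illustration of how a prior guess and a hypothetical prior sample
size translate into the two hyperparameters and the summaries in equations (2)
through (4), consider the sketch below. The function names and the example
values (a prior mean of 0.8 backed by a hypothetical prior sample with n' = 10)
are assumptions made for this example.

```python
def prior_hyperparameters(n_prior, r_prior):
    """Return (alpha', beta') with alpha' = n'/2 and beta' = n'/[2(1 - r')]."""
    if n_prior <= 0:
        raise ValueError("n' must be positive for a proper prior")
    if r_prior >= 1:
        raise ValueError("the prior mean r' must be less than one")
    return n_prior / 2, n_prior / (2 * (1 - r_prior))


def prior_summaries(n_prior, r_prior):
    """Prior mean, mode and variance of coefficient alpha, equations (2)-(4).

    The mode is returned as None when n' <= 2, where the density is J shaped.
    """
    mean = r_prior
    mode = None
    if n_prior > 2:
        mode = (n_prior - 2) / n_prior * r_prior + 2 / n_prior
    variance = 2 * (1 - r_prior) ** 2 / n_prior
    return mean, mode, variance


alpha_p, beta_p = prior_hyperparameters(10, 0.8)
mean_p, mode_p, var_p = prior_summaries(10, 0.8)
print(alpha_p, beta_p, mean_p, mode_p, var_p)
```

The same summaries follow from the hyperparameters directly, since the mean is
1 - alpha'/beta', the mode is 1 - (alpha' - 1)/beta', and the variance is
alpha'/beta'^2.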
Before combining our prior with the likelihood, it is convenient to transform 
$\rho$ to $\tau$ where $\tau = 1/(1 - \rho)$. This yields as prior distribution for $\tau$ 

$$p(\tau) = \frac{\beta'^{\alpha'}}{\Gamma(\alpha')}\,\tau^{-(\alpha'+1)}\exp\left(-\beta'/\tau\right), \qquad \tau > 0. \tag{5}$$

This is an inverse gamma distribution and we denote it $IG_\tau(\alpha', \beta')$. 

As previously mentioned Feldt (1965) and VanZyl, Neudecker, and Nel 
(2000) derived a frequentist distribution theory for coefficient alpha. Let 
$t = 1/(1 - r)$ where r denotes the sample alpha coefficient. For a sample that has 
n+1 examinees and m+1 items they showed that 

$$\frac{t}{\tau} = \frac{1-\rho}{1-r} \sim F_{n,\,nm}. \tag{6}$$

From a Bayesian perspective this F distribution is a marginal or integrated 
likelihood (Bernardo & Smith, 1994), and we derive it under a Bayesian model in 
Appendix A. 

For nm large the $F_{n,\,nm}$ distribution is well approximated by a $\chi^2_n/n$ 
distribution and our Bayesian method of inference is based on this 
approximation. Considered as a function of $\tau$ we write the $\chi^2_n$ based likelihood as 

$$L(\tau \mid t) = k(n,t)\,\tau^{-n/2}\exp\left(-\frac{nt}{2\tau}\right), \tag{7}$$

where k(n, t) is a function of n and t only and does not depend on $\tau$. Inspection 
of equation (7) shows that it is the kernel of an inverse gamma distribution for $\tau$. 
The inverse gamma distribution is closed under multiplication; consequently, 
because our prior distribution for $\tau$ is also inverse gamma, it follows that 
application of Bayes theorem yields a posterior inverse gamma distribution for $\tau$. 
We denote the posterior $IG_\tau(\alpha'', \beta'')$. Multiplying (5) and (7) together as dictated 
by Bayes theorem gives the following values for $\alpha''$ and $\beta''$: 

$$\alpha'' = \frac{n' + n}{2} \quad \text{and} \tag{8}$$

$$\beta'' = \frac{1}{2}\left[\frac{n'}{1-r'} + \frac{n}{1-r}\right]. \tag{9}$$
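
One way to gauge the chi-square approximation behind equation (7) is to compare
the first two moments of the F distribution with n and nm degrees of freedom
against those of a chi-square variable with n degrees of freedom divided by n.
The sketch below uses the standard closed-form moments of the F distribution;
the function names are assumptions for this example, and the three designs are
a subset of the sample sizes used later in the paper.

```python
def f_moments(d1, d2):
    """Mean and variance of an F(d1, d2) variable; requires d2 > 4."""
    mean = d2 / (d2 - 2)
    var = 2 * d2 ** 2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))
    return mean, var


def chi2_over_df_moments(df):
    """Mean and variance of a chi-square(df) variable divided by df."""
    return 1.0, 2.0 / df


for n_items, n_examinees in [(10, 10), (20, 20), (35, 40)]:
    n, m = n_examinees - 1, n_items - 1
    mean_f, var_f = f_moments(n, n * m)
    mean_c, var_c = chi2_over_df_moments(n)
    print(n_items, n_examinees,
          round(mean_f, 4), round(mean_c, 4),
          round(var_f, 4), round(var_c, 4))
```

The F variance always exceeds 2/n, with the gap closing as nm grows, so the
approximate likelihood is somewhat too concentrated for small designs.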
We calculate the posterior distribution of $\rho$ by making the transformation 
$\rho = 1 - 1/\tau$. This yields a $G_\rho(\alpha'', \beta'')$ posterior distribution for $\rho$. Posterior 
summary indices for $\rho$ can be calculated from equations (8) and (9). They are: 

$$E(\rho \mid r) = \frac{n' r'(1-r) + n r(1-r')}{n'(1-r) + n(1-r')}, \tag{10}$$

$$M(\rho \mid r) = \frac{n' r'(1-r) + n r(1-r') + 2(1-r')(1-r)}{n'(1-r) + n(1-r')}, \quad \text{and} \tag{11}$$

$$V(\rho \mid r) = \frac{2(n'+n)(1-r')^2(1-r)^2}{\left[n'(1-r) + n(1-r')\right]^2}. \tag{12}$$

If an improper non-informative prior is selected by taking $n' = 0$ then equations 
(10), (11), and (12) will reduce to equations (2), (3), and (4) but with $n'$ and $r'$ 
replaced by n and r. 
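
The update in equations (8) and (9) and the summaries in equations (10)
through (12) can be sketched as follows. Everything named here is an assumption
for illustration: the function names, the Monte Carlo interval (which draws
1 - rho from a gamma distribution with shape alpha'' and rate beta''), and the
seed. The example values n = 9 and r = 0.48 correspond to the 10 item, 10
examinee sample reported in the Results.

```python
import random


def posterior_hyperparameters(n_prior, r_prior, n, r):
    """Equations (8) and (9); n_prior = 0 gives the non-informative case."""
    alpha2 = (n_prior + n) / 2
    beta2 = 0.5 * (n_prior / (1 - r_prior) + n / (1 - r))
    return alpha2, beta2


def posterior_summaries(alpha2, beta2):
    """Posterior mean, mode and variance of coefficient alpha."""
    mean = 1 - alpha2 / beta2
    mode = 1 - (alpha2 - 1) / beta2 if alpha2 > 1 else None
    variance = alpha2 / beta2 ** 2
    return mean, mode, variance


def credible_interval(alpha2, beta2, level=0.95, draws=20000, seed=1):
    """Equal-tailed Monte Carlo interval for coefficient alpha."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); the scale is 1 / rate.
    sims = sorted(1 - rng.gammavariate(alpha2, 1 / beta2) for _ in range(draws))
    lower = sims[int(draws * (1 - level) / 2)]
    upper = sims[int(draws * (1 + level) / 2) - 1]
    return lower, upper


a2, b2 = posterior_hyperparameters(0, 0.0, 9, 0.48)
post_mean, post_mode, post_var = posterior_summaries(a2, b2)
print(round(post_mean, 3), round(post_mode, 3), round(post_var ** 0.5, 3))
print(credible_interval(a2, b2))
```

Because the posterior has the same form as the prior, it can serve as the prior
for a later sample by setting the new n' to 2 * alpha'' and the new r' to the
posterior mean, which reproduces the running total in equation (9).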

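The frequentist sampling distribution for the sample alpha coefficient can be 
derived from (6). It is 

$$p(r \mid \rho) = \frac{\Gamma\!\left(\frac{n(m+1)}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)\Gamma\!\left(\frac{nm}{2}\right)(1-r)} \left[\frac{1-\rho}{m(1-r)}\right]^{n/2} \left[1 + \frac{1-\rho}{m(1-r)}\right]^{-n(m+1)/2}, \qquad -\infty < r < 1. \tag{13}$$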



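and sampling distribution summary indices for r are: 

$$E(r \mid \rho) = \frac{n}{n-2}\,\rho - \frac{2}{n-2} \approx \rho \ \text{ for large } n, \tag{14}$$

$$M(r \mid \rho) = \left(\frac{n}{n+2}\right)\left(\frac{nm-2}{nm}\right)\rho + 1 - \left(\frac{n}{n+2}\right)\left(\frac{nm-2}{nm}\right) \approx \frac{n}{n+2}\,\rho + \frac{2}{n+2} \ \text{ for large } nm, \text{ and} \tag{15}$$

$$V(r \mid \rho) = \frac{2n(nm+n-2)}{m(n-4)(n-2)^2}\,(1-\rho)^2 \approx \frac{2(m+1)(1-\rho)^2}{nm} \ \text{ for large } n. \tag{16}$$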
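
The closed-form mean and variance of the sampling distribution can be checked
against a direct numerical integration of the density implied by the F
distribution in (6). The sketch below is an illustration under stated
assumptions: the function names, the midpoint rule, and the truncation of the
lower limit at -10 are choices made here, not part of the paper.

```python
import math


def sampling_mean(n, rho):
    """Mean of the sample alpha given rho; requires n > 2."""
    return n / (n - 2) * rho - 2 / (n - 2)


def sampling_variance(n, m, rho):
    """Variance of the sample alpha given rho; requires n > 4."""
    return 2 * n * (n * m + n - 2) * (1 - rho) ** 2 / (m * (n - 4) * (n - 2) ** 2)


def log_density(r, rho, n, m):
    """Log density of r given rho, from (1 - rho)/(1 - r) ~ F(n, nm)."""
    u = (1 - rho) / (m * (1 - r))
    log_c = (math.lgamma(n * (m + 1) / 2)
             - math.lgamma(n / 2) - math.lgamma(n * m / 2))
    return (log_c - math.log(1 - r)
            + (n / 2) * math.log(u) - (n * (m + 1) / 2) * math.log1p(u))


def numeric_moments(rho, n, m, lower=-10.0, steps=100000):
    """Midpoint-rule total mass, mean and variance on (lower, 1)."""
    h = (1 - lower) / steps
    total = first = second = 0.0
    for k in range(steps):
        r = lower + (k + 0.5) * h
        w = math.exp(log_density(r, rho, n, m)) * h
        total += w
        first += w * r
        second += w * r * r
    mean = first / total
    return total, mean, second / total - mean ** 2


# 10 items and 10 examinees (m = 9, n = 9) with r = 0.48 substituted for rho.
print(round(sampling_mean(9, 0.48), 2),
      round(sampling_variance(9, 9, 0.48) ** 0.5, 2))
```

Substituting r for rho in this way gives 0.33 and 0.44, the F mean and F
standard deviation that the paper reports for its smallest sample.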

Our Bayesian method is based on an approximation to the likelihood given 
in (6), but the sampling distribution of alpha given in (13) is derived exactly 
from the likelihood given in (6). Some idea of the accuracy of our approximation 
can be obtained from a comparison between the posterior distribution of alpha 
based on a non-informative prior and the exact sampling distribution of alpha. 
Both densities depend only on the data and on the data only through the 
sufficient statistic r. Comparison of results between the two procedures will be 
presented in the Results section. 

Our second Bayesian procedure uses the MCMC method as implemented by 
the computer program WinBUGS. We use the normal theory two-way examinees 
by items ANOVA model with one observation per cell — the same model 
originally used by Feldt (1965). We treat both examinees and items as random 
effects. The model is 



$$y_{ij} = \mu + a_i + b_j + e_{ij} \tag{17}$$

where $a_i$ denotes the effect of the i-th examinee, $b_j$ denotes the effect of the j-th 
item, and $e_{ij}$ denotes the ij-th error component. Following Hoyt (1941) we 
note that 

$$r = \frac{MS_a - MS_e}{MS_a} \tag{18}$$

where $MS_a$ is the observed examinee mean square and $MS_e$ is the observed 
interaction mean square, and from Feldt (1965) we have 

$$\rho = \frac{(m+1)\sigma_a^2}{(m+1)\sigma_a^2 + \sigma_e^2}. \tag{19}$$

The symbols just introduced have their usual meanings under the two-way 
random effects ANOVA model (see Box and Tiao (1973) or Sahai and Ageel 
(2000)) and they are defined in Appendix A of this paper. 

To use WinBUGS we first needed to generate pseudo-random samples of 
observations based on the model in equation (17). We took $\mu = 0$ and generated 
values of a from a Normal(0, $\sigma_a^2$) distribution, values of b from a Normal(0, $\sigma_b^2$) 
distribution, and values of e from a Normal(0, $\sigma_e^2$) distribution. The values of the 
two variance components, $\sigma_a^2$ and $\sigma_e^2$, were chosen so as to give specific values 
for $\rho$. Nine such samples were generated, and their characteristics are presented 
in Table 1. The computer program JMP from the SAS Institute was used to 
generate the samples. 
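
A comparable data-generation step can be sketched in Python. This is not the
JMP procedure used for the paper: the function names, the choice of an error
variance of 1.0 and an item variance of 0.25, and the seed are all assumptions
made for illustration. The examinee variance is obtained by solving the
variance-component expression for alpha for the examinee component.

```python
import random


def examinee_variance(rho, n_items, sigma2_e=1.0):
    """Examinee variance that yields the target alpha for n_items items."""
    return rho * sigma2_e / (n_items * (1 - rho))


def simulate_scores(n_examinees, n_items, rho,
                    sigma2_b=0.25, sigma2_e=1.0, seed=0):
    """Draw y[i][j] = a[i] + b[j] + e[i][j] with the grand mean set to zero."""
    rng = random.Random(seed)
    sd_a = examinee_variance(rho, n_items, sigma2_e) ** 0.5
    a = [rng.gauss(0.0, sd_a) for _ in range(n_examinees)]
    b = [rng.gauss(0.0, sigma2_b ** 0.5) for _ in range(n_items)]
    return [[a[i] + b[j] + rng.gauss(0.0, sigma2_e ** 0.5)
             for j in range(n_items)] for i in range(n_examinees)]


def sample_alpha(scores):
    """Sample alpha from the examinee and interaction mean squares."""
    n_rows, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_rows * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n_rows for j in range(k)]
    ss_rows = k * sum((rm - grand) ** 2 for rm in row_means)
    ss_cols = n_rows * sum((cm - grand) ** 2 for cm in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ms_rows = ss_rows / (n_rows - 1)
    ms_inter = (ss_total - ss_rows - ss_cols) / ((n_rows - 1) * (k - 1))
    return (ms_rows - ms_inter) / ms_rows


scores = simulate_scores(n_examinees=20, n_items=10, rho=0.5, seed=1)
print(round(sample_alpha(scores), 2))
```

With only 20 examinees the sample alpha can land well away from the target
value, which is the sampling variability the paper's comparison is about.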
TABLE 1 

Characteristics of the Nine Pseudo-Randomly Generated Samples 

p=m+1    N=n+1    $\rho$      r 
  10       10     0.5     0.48     0.057    0.46 
  10       20     0.5     0.58     0.056    0.44 
  10       40     0.5     0.53     0.082    0.44 
  20       10     0.7     0.65    -0.059    0.46 
  20       20     0.7     0.73    -0.047    0.46 
  20       40     0.7     0.71    -0.015    0.45 
  35       10     0.9     0.92    -0.081    0.52 
  35       20     0.9     0.92    -0.070    0.51 
  35       40     0.9     0.91    -0.018    0.52 


We input into WinBUGS the model given in equation (17) along with the 
accompanying normal distribution assumptions for the model effects. The basic 
model parameters are $\mu$ and the three variance components (though WinBUGS 
uses precisions which are the inverses of the variances). The parameter of interest 
is $\rho$ and as can be seen in equation (19) it is a function of the two variance 
components $\sigma_a^2$ and $\sigma_e^2$. WinBUGS uses the distributions of the basic model 
parameters and the data to generate the marginal posterior distribution of the 
functional parameter $\rho$. 

We wanted to compare the WinBUGS method to the frequentist method so 
we gave the basic model parameters very diffuse, nearly non-informative prior 
distributions. We gave $\mu$ a Normal(0.0, 10,000) prior distribution and the inverses 
of the three variance components Gamma(0.0001, 0.0001) distributions which 
have means of unity and standard deviations of 100. An annotated example of 
our WinBUGS code is presented in Appendix B. 
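
The stated mean and standard deviation follow from the moments of a gamma
distribution in the shape and rate form that WinBUGS uses. As a quick check,
writing $\omega$ for a precision with shape a and rate b (symbols introduced here
only for this illustration):

```latex
\[
E(\omega) = \frac{a}{b} = \frac{0.0001}{0.0001} = 1,
\qquad
SD(\omega) = \frac{\sqrt{a}}{b} = \frac{0.01}{0.0001} = 100 .
\]
```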



Results 

We compare the classical sampling distribution of r, as given in equation 
(13), with the posterior distribution of $\rho$ obtained from our approximate Bayesian 
method using an improper non-informative prior. In Figures 1 through 9 in 
Appendix C both densities are plotted for the values of alpha, m, and n given in 
Table 1. Inspection of Figures 1 through 9 shows that when n is small both tails of 
the sampling distribution of alpha are shifted left in comparison to the tails of 
the Bayesian posterior distribution. The left tail of the sampling distribution can 
be especially elongated when n is small. This causes the Bayesian posterior 
variance to be smaller than the sampling distribution variance even though the 
shapes of the two distributions are closely matched around their nearly equal 
modes. 

Inferential statistics for $\rho$ from the classical sampling distribution method 
(F), our approximate Bayesian Gamma method (G), and the WinBUGS program 
(W) are given in Table 2. These statistics are for the nine pseudo-randomly 
generated samples. The WinBUGS results in Table 2 appear to match the 
classical method results more closely than the results of our approximate 
Bayesian method (with the exception of the standard deviation in the 10 item, 10 
examinee sample). But, as just noted, Figures 1 through 9 show that our 
approximate Bayesian density and the classical sampling density have nearly 
identical modes, and all three methods quickly converge as m and n increase. 



TABLE 2 

Means and SDs for the Three Different Methods: F, G, and W 

p=m+1   N=n+1    r      F mean   G mean   W mean   F sd    G sd    W sd 
  10      10    0.48     0.33     0.48     0.31    0.44    0.25    0.25 
  10      20    0.58     0.53     0.58     0.50    0.18    0.14    0.19 
  10      40    0.53     0.51     0.53     0.50    0.12    0.11    0.13 
  20      10    0.65     0.55     0.65     0.56    0.29    0.17    0.22 
  20      20    0.73     0.70     0.73     0.71    0.11    0.09    0.10 
  20      40    0.71     0.69     0.71     0.70    0.08    0.07    0.07 
  35      10    0.92     0.89     0.92     0.91    0.07    0.04    0.04 
  35      20    0.92     0.91     0.92     0.92    0.03    0.03    0.03 
  35      40    0.91     0.91     0.91     0.91    0.02    0.02    0.02 

These results indicate that our approximate Bayesian method performed 
reasonably well when there were at least ten items and ten examinees. Its 
performance was excellent when there were at least twenty items and twenty 
examinees. Whenever a Bayesian approach is preferred we recommend the use of 
our approximate Bayesian method so long as m and n are not too small. Our 
approximate method is very easy to compute, but the WinBUGS computer 
program, which can be downloaded for free from the Internet, can also be used 
to obtain a marginal posterior distribution for alpha. Our Bayesian inferential 
procedures are based on the same models used for frequentist inference for alpha. 
Past studies previously mentioned have shown the frequentist methods to be 
robust to violations of their assumptions. It therefore is reasonable to expect our 
Bayesian methods to be similarly robust to violations of their assumptions. 
Appendix A 



Derivation of the Marginal Likelihood 
When nuisance parameters are present in a Bayesian analysis they can be 
carried through to the posterior distribution and then integrated out. Bernardo 
and Smith (p. 245, 1994) present an alternative approach. They suggest 
integrating the nuisance parameters out of the likelihood and then using the 
resulting marginal or integrated likelihood for a Bayesian analysis. We now follow 
their method to derive the likelihood given in equation (6). 

For notational convenience let p = m + 1 and N = n + 1. Let the column 
vector, $x_i$, contain examinee i’s responses to the p items. We assume that the $x_i$, 
i = 1, ..., N, constitute a random sample from a MVN($\theta$, $\Sigma$) distribution where $\Sigma$ 
has the compound symmetric form 

$$\Sigma = \phi^2\left[(1-\rho_0)\,I + \rho_0\,jj'\right] = \sigma_e^2\,I + \sigma_a^2\,jj'. \tag{20}$$

In equation (20) $\Sigma$ is the inter-item covariance matrix, I is a p by p identity 
matrix, and $jj'$ is a p by p matrix of all ones. All the items have variance $\phi^2$ and 
$\rho_0$ is the common inter-item correlation. The two variance components, $\sigma_a^2$ and 
$\sigma_e^2$, relate to the ANOVA model of Feldt (1965). We include both 
parameterizations so the reader can relate the final results to the likelihood given 
in (6) and the model given in equations (17), (18), and (19). More specifically, 
$\sigma_e^2 = \phi^2(1-\rho_0)$ is the error variance component, and $\sigma_a^2 = \phi^2\rho_0$ is the examinee 
variance component and also the common inter-item covariance. Under this 
model $\rho_0$ is the correlation between any two items and also the reliability of a 
single item. 

The sample mean vector and covariance matrix are: 

$$\bar{x} = N^{-1}\sum_{i=1}^{N} x_i \quad \text{and} \tag{21}$$

$$S = N^{-1}\sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})'. \tag{22}$$

The sum of squares and cross-products matrix is V = NS. The statistics $\bar{x}$ and V 
are independent and constitute sufficient statistics for $\theta$ and $\Sigma$. The distribution 
of $\bar{x}$, denoted $f(\bar{x} \mid \theta, \Sigma)$, is MVN($\theta$, $N^{-1}\Sigma$) and V has a Wishart($\Sigma$, n) 
distribution that we write as $f(V \mid \Sigma, n)$. The posterior distribution of a 
parameter depends on the data only through its sufficient statistic. So following 
Press (1982, pp. 186-187) we take as the likelihood in this situation the joint 
distribution of the sufficient statistics $\bar{x}$ and V. Because they are statistically 
independent their joint distribution is just the product of their marginal 
distributions. Hence, 

$$f(\bar{x}, V \mid \theta, \Sigma) = f(V \mid \Sigma, n)\, f(\bar{x} \mid \theta, \Sigma) = \frac{k(p,n)\,|V|^{(n-p-1)/2}\exp\left[-\tfrac{1}{2}\mathrm{tr}\left(\Sigma^{-1}V\right)\right]}{|\Sigma|^{n/2}} \cdot \frac{\exp\left[-\tfrac{N}{2}(\theta-\bar{x})'\Sigma^{-1}(\theta-\bar{x})\right]}{(2\pi)^{p/2}\,\left|N^{-1}\Sigma\right|^{1/2}}, \tag{23}$$

where k(p, n) is the constant term for the Wishart distribution. 

There is symmetry between $\bar{x}$ and $\theta$ in the above normal distribution. If we 
take the improper non-informative prior distribution $f(\theta \mid \Sigma) = 1$ for the nuisance 
parameter $\theta$ and then integrate equation (23) with respect to it, all that will 
remain is the Wishart marginal likelihood for V and that is free of $\theta$. The proper 
conjugate prior for $\theta$ in the above likelihood is MVN($\mu$, $\nu^{-1}\Sigma$) where $\mu$ is our 
best a priori estimate of $\theta$ and $\nu$ measures the strength of our belief in that 
a priori estimate. Integrating equation (23) with respect to such a MVN prior for 
$\theta$ will not give the desired result. The improper non-informative prior must be 
used. Our method forgoes specifying prior information for $\theta$ so as to make simple 
the specifying of prior information for alpha. 

We next follow VanZyl, Neudecker, and Nel (2000) and consider two scalar 
transformations of V along with their sampling distributions: 

$$MS_a = \frac{j'Vj}{np} \sim \left[1 + m\rho_0\right]\phi^2\,\frac{\chi^2_n}{n} \quad \text{and} \tag{24}$$

$$MS_e = \frac{\mathrm{tr}(V) - n\,MS_a}{nm} \sim (1-\rho_0)\,\phi^2\,\frac{\chi^2_{nm}}{nm}. \tag{25}$$

These two random variables are statistically independent and constitute sufficient 
statistics for $\phi^2$ and $\rho_0$. Again, because the posterior distribution of a parameter 
depends on the data only through its sufficient statistic, our marginal likelihood 
for V reduces to the product of the two independently distributed variables 
$MS_a$ and $MS_e$. 
Under the covariance structure given in equation (20) coefficient alpha 
equals reliability rather than just being a lower bound for reliability, and $\rho$ can be 
expressed as a function of $\rho_0$ using the Spearman-Brown formula: 

$$\rho = \frac{(m+1)\rho_0}{1 + m\rho_0}. \tag{26}$$

Using the inverse of equation (26) one finds that the ratio of the expected values 
of $MS_a$ and $MS_e$ is 

$$\frac{E(MS_a)}{E(MS_e)} = \frac{1 + m\rho_0}{1 - \rho_0} = \frac{1}{1-\rho} = \tau. \tag{27}$$

Using equation (18) shows that 

$$\frac{MS_a}{MS_e} = \frac{1}{1-r} = t. \tag{28}$$

Dividing equation (28) by equation (27) gives the final result 

$$\frac{MS_a/MS_e}{E(MS_a)/E(MS_e)} = \frac{t}{\tau} \sim F_{n,\,nm}. \tag{29}$$

Hence, t is a sufficient statistic for $\tau$ (and r sufficient for $\rho$), and when interest 
focuses solely on $\rho$ this $F_{n,\,nm}$ distribution can be used as the likelihood in a 
Bayesian analysis of alpha. 
Appendix B 



WinBUGS Code For Inference about Coefficient Alpha 
Based on a Two-Way Random Effects ANOVA Model 
In the WinBUGS code below we use the following conventions. WinBUGS 
code words are in all capital letters. The names of user supplied constants or 
initial values are in all lower case italicized letters; xx is a placeholder for where 
the values of the constants and initial values need to be specified. The names of 
variables and parameters are in regular all lower case letters. Note that for 
simplicity we have given a value of zero to the grand mean, $\mu$, instead of treating 
it as a parameter and giving it a prior distribution. 



MODEL
{
   # random examinee effects
   FOR(i IN 1:nostuds)
      { a[i] ~ DNORM(0, tau.a) }
   # random item effects
   FOR(j IN 1:noitems)
      { b[j] ~ DNORM(0, tau.b) }
   # one observation per examinee-by-item cell; grand mean fixed at 0
   FOR(i IN 1:nostuds)
      { FOR(j IN 1:noitems)
         { m[i,j] <- 0 + a[i] + b[j]
           y[i,j] ~ DNORM(m[i,j], tau.c) } }
   # variance components are the inverses of the precisions
   sigma2.a <- 1/tau.a
   sigma2.b <- 1/tau.b
   sigma2.c <- 1/tau.c
   # coefficient alpha from the examinee and error variance components
   alpha <- (sigma2.a)/(sigma2.a + (sigma2.c/noitems))
   tau.a ~ DGAMMA(xx, xx)
   tau.b ~ DGAMMA(xx, xx)
   tau.c ~ DGAMMA(xx, xx)
}

LIST(nostuds=xx, noitems=xx,
   y=STRUCTURE(.DATA=
      C(... data values separated by commas ...),
      .DIM=C(xx,xx)
   )
)

LIST(tau.a=xx, tau.b=xx, tau.c=xx)

The last LIST statement contains initial values for the three precisions, which 
are the inverses of the three variance components. DNORM denotes a normal 
distribution, specified by its mean and precision, and DGAMMA denotes a gamma 
distribution. 
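
For readers without WinBUGS, the same model can be explored with a
hand-written Gibbs sampler. The sketch below is not the authors' code and does
not reproduce WinBUGS internals: the function name, the default Gamma(0.0001,
0.0001) priors on the precisions, the starting values, and the iteration counts
are assumptions made for illustration. As in the listing above, the grand mean
is fixed at zero.

```python
import random


def gibbs_alpha(y, n_iter=1500, burn_in=500, eps=1e-4, seed=0):
    """Posterior draws of coefficient alpha for y[i][j] = a[i] + b[j] + e[i][j].

    Each precision has a Gamma(eps, eps) prior, so every full conditional is
    either normal or gamma and can be sampled directly.
    """
    rng = random.Random(seed)
    n_studs, n_items = len(y), len(y[0])
    a = [0.0] * n_studs
    b = [0.0] * n_items
    tau_a = tau_b = tau_c = 1.0
    draws = []
    for it in range(n_iter):
        # Examinee effects: normal with precision n_items * tau_c + tau_a.
        prec = n_items * tau_c + tau_a
        sd = prec ** -0.5
        for i in range(n_studs):
            resid = sum(y[i][j] - b[j] for j in range(n_items))
            a[i] = rng.gauss(tau_c * resid / prec, sd)
        # Item effects: normal with precision n_studs * tau_c + tau_b.
        prec = n_studs * tau_c + tau_b
        sd = prec ** -0.5
        for j in range(n_items):
            resid = sum(y[i][j] - a[i] for i in range(n_studs))
            b[j] = rng.gauss(tau_c * resid / prec, sd)
        # Precisions: gammavariate takes (shape, scale), with scale = 1 / rate.
        tau_a = rng.gammavariate(eps + n_studs / 2,
                                 1 / (eps + sum(v * v for v in a) / 2))
        tau_b = rng.gammavariate(eps + n_items / 2,
                                 1 / (eps + sum(v * v for v in b) / 2))
        sse = sum((y[i][j] - a[i] - b[j]) ** 2
                  for i in range(n_studs) for j in range(n_items))
        tau_c = rng.gammavariate(eps + n_studs * n_items / 2,
                                 1 / (eps + sse / 2))
        if it >= burn_in:
            sigma2_a, sigma2_c = 1 / tau_a, 1 / tau_c
            draws.append(sigma2_a / (sigma2_a + sigma2_c / n_items))
    return draws
```

Calling gibbs_alpha on a list of rows, one row per examinee, returns the
retained draws; their mean and standard deviation play the role of the W mean
and W sd columns in Table 2, although a short hand-rolled chain like this should
be checked for convergence before its output is trusted.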








Appendix C 



Figures 




[Figures 1 through 9: classical sampling density of r and approximate Bayesian 
posterior density of alpha, plotted for the values of alpha, m, and n given in 
Table 1. The plots are not reproduced here.] 








